
[CoreML] support coreml model cache #23065

Open · wejoncy wants to merge 34 commits into main from jicwen/coreml_cache
Conversation

@wejoncy (Contributor) commented Dec 10, 2024

Description

Refactor compute plan profiling.

Support caching the compiled CoreML model to speed up session initialization. The cache is only enabled via a user-provided entry, and the user is responsible for managing the cache.

With the cache, session initialization time can be reduced by 50% or more:

| model | before | after |
| --- | --- | --- |
| yolo11.onnx | 0.6s | 0.1s |
| yolo11-fp16.onnx | 1.8s | 0.1s |
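For reference, a minimal sketch of how a user might turn the cache on from Python. The `ModelCachePath` option name matches the constant added in this PR; whether it is passed through the Python provider-options dict exactly like this is an assumption, and the cache directory value is illustrative:

```python
import onnxruntime as ort

# Hypothetical usage sketch: point ModelCachePath at a user-managed
# directory. On later session creations the EP can reuse the compiled
# CoreML model found there; the user must clear the cache when the
# model changes.
sess = ort.InferenceSession(
    "yolo11.onnx",
    providers=[
        ("CoreMLExecutionProvider", {"ModelCachePath": "/tmp/coreml_cache"}),
    ],
)
```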

Motivation and Context

@wejoncy wejoncy requested a review from skottmckay December 10, 2024 10:26
@wejoncy wejoncy marked this pull request as ready for review December 10, 2024 10:26
@wejoncy wejoncy linked an issue Dec 10, 2024 that may be closed by this pull request
@wejoncy wejoncy force-pushed the jicwen/coreml_cache branch from 5bfc8eb to fc9db07 on December 10, 2024 11:10
@wejoncy wejoncy force-pushed the jicwen/coreml_cache branch from d539da2 to 1d1c874 on December 10, 2024 12:42
@wejoncy wejoncy force-pushed the jicwen/coreml_cache branch from 81c2b9e to 7b11848 on December 16, 2024 06:16
Comment on lines 115 to 119
if (require_static_shape_) {
  model_cache_path_ += "/static_shape";
} else {
  model_cache_path_ += "/dynamic_shape";
}
@skottmckay (Contributor) commented Dec 13, 2024

nit: Is this required? Would be good to keep this as simple as possible. #Closed

@wejoncy (Contributor, Author)

Yes, it's required.

Whether require_static_shape_ is set determines what the sub-graph looks like, and gen_metadef_name doesn't check the input/output shape info.

// } else {
//   // save to ModelCachePath
// }
// We do not detect whether the cached model matches the onnx subgraph, so the user should carefully manage the cache for a new model.
@skottmckay (Contributor) commented Dec 16, 2024

We should document what we do and don't do here.

  • Prefer cache key from model metadata
    • we could include example python here (see the sketch below), and skip the actual implementation of the hashing to keep it simple
  • Use model path if available

If model changes user must do one of:

  • Set different cache key in model metadata
  • Load model from a different path
  • Delete old cache information #Closed
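Along those lines, a possible sketch of the suggested documentation example. The `CACHE_KEY` metadata name matches the code in this PR; the sha256-over-model-bytes choice and the helper name are illustrative only, not the EP's actual hashing:

```python
import hashlib

import onnx

def set_coreml_cache_key(model_path: str, out_path: str) -> str:
    """Illustrative sketch: derive a stable key from the model bytes
    and store it in the ONNX model metadata under CACHE_KEY."""
    with open(model_path, "rb") as f:
        key = hashlib.sha256(f.read()).hexdigest()  # 64 hex characters

    model = onnx.load(model_path)
    entry = model.metadata_props.add()  # StringStringEntryProto
    entry.key = "CACHE_KEY"
    entry.value = key
    onnx.save(model, out_path)
    return key
```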

@skottmckay (Contributor)

Or we can document fully in CoreML-ExecutionProvider.md and include a link to that here.

@wejoncy (Contributor, Author)

Sure.

//   // save to ModelCachePath
// }
// We do not detect whether the cached model matches the onnx subgraph, so the user should carefully manage the cache for a new model.
static const char* const kCoremlProviderOption_ModelCachePath = "ModelCachePath";
@skottmckay (Contributor) commented Dec 16, 2024

We should have a const for the model metadata key name here. I'd vote for COREML_CACHE_KEY given the usage is CoreML specific. #Closed

@wejoncy (Contributor, Author)

Sure.
Maybe the cache_key could be used by other EPs as well.

Comment on lines 408 to 409
ORT_ENFORCE(std::count(subgraph_name.begin(), subgraph_name.end(), '_') == 3,
            "Unexpected graph name format: ", subgraph_name);
@skottmckay (Contributor) commented Dec 16, 2024

Given the cache is an optional feature, it might be better to disable caching and log the error instead of throwing, as it could break an iOS app completely (e.g. if they don't have logic to explicitly turn off caching it will be broken). #Closed

@wejoncy (Contributor, Author)

This can't actually throw unless CoreML EP developers modify gen_metadef_name without updating this check (or vice versa).

Users can leave 'coreml_options.ModelCachePath()' unset to disable the cache.

But it should be fine to remove this check.

Comment on lines 69 to 70
// model_hash is a 64-bit hash value of model_path
user_provide_key = std::to_string(model_hash);
@skottmckay (Contributor) commented Dec 16, 2024

This is only true if the model path is available. If not it hashes the graph input names and all the node output names. #Closed

@wejoncy (Contributor, Author)

I have documented it in the CoreML EP docs.
I will add more comments here.
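For context, a conceptual Python sketch of the fallback described above. This is not ORT's actual implementation or hash function, just an illustration that without a model path the hash is derived from the graph input names and all the node output names:

```python
import hashlib

def fallback_model_key(graph_input_names: list[str],
                       node_output_names: list[str]) -> str:
    # Conceptual stand-in: ORT hashes these names internally with its
    # own hash function; sha256 is used here only for illustration.
    h = hashlib.sha256()
    for name in graph_input_names:
        h.update(name.encode())
    for name in node_output_names:
        h.update(name.encode())
    return h.hexdigest()
```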

onnxruntime/core/providers/coreml/coreml_options.cc (review thread outdated; resolved)
  main_graph = main_graph->ParentGraph();
}
if (main_graph->GetModel().MetaData().count("CACHE_KEY") > 0) {
  user_provide_key = graph_viewer.GetGraph().GetModel().MetaData().at("CACHE_KEY");
@skottmckay (Contributor) commented Dec 16, 2024

We need to validate this with something like std::isalnum to guarantee it will be valid for use in the filesystem. I'd also suggest we enforce a maximum length to also try and avoid issues creating folders/files with this in the name. #Closed

@wejoncy (Contributor, Author)

Yeah, we should do it.
For now, the cache key may have at most 32 characters; otherwise we re-hash it.

@skottmckay (Contributor)

64 chars + null is required to store a sha256 hash as hex so maybe we could use that as the limit?

@wejoncy (Contributor, Author) commented Dec 17, 2024

64 came to mind at first too, but I worried it would be too long.
Let me use 64 as the limit, then.

@skottmckay (Contributor)

it's a maximum so the user isn't forced to use it all. Allowing a 256 bit hash seems reasonable. Max directory/file name limit is 255 chars so 64 is well under that. Max path is somewhere around 1000 chars. User also controls the cache path.

But we should be careful about how deep the directory names get for the cache files. It might make more sense to shorten things like whether static shapes were enabled, and for the model it's really only the subgraph id that matters.

i.e. including the user hash and the model hash in a directory name (userhash_COREML_modelhash_subgraphid) shouldn't actually matter when all files for the model are stored under a top-level directory name that uses the preferred hash, as there a) should never be files from any other models in there, and b) the subgraphid should be deterministic (unless they run with a different optimization level).

Take a look at how long the paths are in your tests and figure out a good balance between readable/safe and avoiding exceeding the max path length.

@wejoncy (Contributor, Author) commented Dec 18, 2024

Sure. It's reasonably balanced now:
<user_provided_path>/cc6b2111b15dcdcf00ed6647c4430315e01616efde46ab4109f80fd1d3c46731/0_dynamic_mlprogram/model/
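To make the settled behavior concrete, a minimal sketch of the key validation discussed in this thread, assuming alphanumeric-only keys of up to 64 characters are accepted and anything else falls back to re-hashing (the helper name is hypothetical):

```python
import hashlib

MAX_CACHE_KEY_LEN = 64  # enough to hold a sha256 hash as hex

def normalize_cache_key(user_key: str) -> str:
    # Accept the user-provided key only if it is filesystem-safe:
    # alphanumeric and within the length limit. Otherwise re-hash it
    # into a value that is guaranteed to satisfy both constraints.
    if 0 < len(user_key) <= MAX_CACHE_KEY_LEN and user_key.isalnum():
        return user_key
    return hashlib.sha256(user_key.encode()).hexdigest()
```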

@wejoncy wejoncy requested a review from skottmckay December 23, 2024 07:57
@skottmckay (Contributor)

Are there any unit tests for the new code? We should be able to test that the expected cache files are created in the right places and that things like invalid cache key values are rejected.

@wejoncy (Contributor, Author) commented Dec 24, 2024

> Are there any unit tests for the new code? We should be able to test that the expected cache files are created in the right places and that things like invalid cache key values are rejected.

Added unit tests covering three cases where the hash is valid or invalid.

wejoncy added 2 commits December 24, 2024 15:18

@microsoft microsoft deleted a comment from github-actions bot Dec 24, 2024
Development

Successfully merging this pull request may close these issues.

CoreML - Writing CoreML Model on every inference session creation